#### Code Generation Part I

1

Chapter 9

# Position of a Code Generator in the Compiler Model



### Code Generation

- Code produced by compiler must be correct

   Source to target program transformation is
   *semantics preserving*
- Code produced by compiler should be of high quality
  - Effective use of target machine resources
  - Heuristic techniques can generate good but suboptimal code, because generating optimal code is undecidable

## Target Program Code

- The back-end code generator of a compiler may generate different forms of code, depending on the requirements:
  - Absolute machine code (executable code)
  - Relocatable machine code (object files for linker)
  - Assembly language (facilitates debugging)
  - Byte code forms for interpreters (e.g. JVM)

## The Target Machine

- Implementing code generation requires thorough understanding of the target machine architecture and its instruction set
- Our (hypothetical) machine:
  - Byte-addressable (word = 4 bytes)
  - Has *n* general purpose registers **R0**, **R1**, ..., **R***n*-1
  - Two-address instructions of the form

op source, destination

## The Target Machine: Op-codes and Address Modes

- Op-codes (*op*), for example
   MOV (move content of *source* to *destination*)
   ADD (add content of *source* to *destination*)
   SUB (subtract content of *source* from *dest*.)
- Address modes

| Mode              | Form                    | Address                            | Added Cost |
|-------------------|-------------------------|------------------------------------|------------|
| Absolute          | М                       | Μ                                  | 1          |
| Register          | R                       | R                                  | 0          |
| Indexed           | $C(\mathbf{R})$         | c+contents( <b>R</b> )             | 1          |
| Indirect register | *R                      | contents(R)                        | 0          |
| Indirect indexed  | * <i>C</i> ( <b>R</b> ) | <pre>contents(c+contents(R))</pre> | 1          |
| Literal           | <b>#</b> <i>C</i>       | N/A                                | 1          |

#### Instruction Costs

- Machine is a simple, non-super-scalar processor with fixed instruction costs
- Realistic machines have deep pipelines, I-cache, D-cache, etc.
- Define the cost of instruction

= 1 + cost(*source*-mode) + cost(*destination*-mode)

## Examples

| Instru | iction        | Operation                                                                                 | Cost |
|--------|---------------|-------------------------------------------------------------------------------------------|------|
| MOV    | R0,R1         | Store <i>content</i> (R0) into register R1                                                | 1    |
| MOV    | R0,M          | Store <i>content</i> ( <b>R0</b> ) into memory location <b>M</b>                          | 2    |
| MOV    | M, R0         | Store <i>content</i> (M) into register R0                                                 | 2    |
| MOV    | 4(R0),M       | Store <i>contents</i> (4+ <i>contents</i> ( <b>R0</b> )) into <b>M</b>                    | 3    |
| MOV    | *4(R0),M      | Store <i>contents</i> ( <i>contents</i> (4+ <i>contents</i> ( <b>R0</b> ))) into <b>M</b> | 3    |
| MOV    | #1,R0         | Store 1 into R0                                                                           | 2    |
| ADD    | 4(R0),*12(R1) | Add contents(4+contents(R0))                                                              |      |
|        |               | to <i>contents</i> (12+ <i>contents</i> ( <b>R1</b> ))                                    | 3    |

#### Instruction Selection

- Instruction selection is important to obtain efficient code
- Suppose we translate three-address code

X := Y + Z

to: MOV 
$$y, R0$$
  
ADD  $z, R0$   
MOV  $R0, X$   
 $a:=a+1$   $\longrightarrow$  MOV  $a, R0$   
ADD  $\#1, R0$   
MOV  $R0, a$   
 $Cost = 6$   
 $Better$   
 $ADD \#1, a$   
 $Cost = 3$   
 $Cost = 2$ 

# Instruction Selection: Utilizing Addressing Modes

- Suppose we translate **a**:=**b**+**c** into
  - MOV b,R0 ADD c,R0
  - MOV R0,a
- Assuming addresses of a, b, and c are stored in R0, R1, and R2

MOV \*R1,\*R0

ADD \*R2,\*R0

 Assuming R1 and R2 contain values of b and с ADD R2,R1 MOV R1,a

# Need for Global Machine-Specific Code Optimizations

• Suppose we translate three-address code

x := y + z

- to: MOV y, R0 ADD z, R0 MOV R0, X
- Then, we translate



# Register Allocation and Assignment

- Efficient utilization of the limited set of registers is important to generate good code
- Registers are assigned by
  - *Register allocation* to select the set of variables that will reside in registers at a point in the code
  - *Register assignment* to pick the specific register that a variable will reside in
- Finding an optimal register assignment in general is NP-complete

## Example

| t:=a+b | t:=a*b |
|--------|--------|
| t:=t*c | t:=t+a |
| t:=t/d | t:=t/d |

MOV a,R1 ADD b,R1 MUL c,R1 DIV d,R1 MOV R1,t :=t/d

MOV a,R0 MOV R0,R1 MUL b,R1 ADD R0,R1 DIV d,R1 MOV R1,t

## Choice of Evaluation Order

• When instructions are independent, their evaluation order can be changed



## Generating Code for Stack Allocation of Activation Records

| t1 := a + b      | 100: ADD #16,SP    | Push frame           |
|------------------|--------------------|----------------------|
| param t1         | 108: MOV a,R0      |                      |
| param c          | 116: ADD b,R0      |                      |
| t2 := call foo,2 | 124: MOV R0,4(SP)  | Store a+b            |
|                  | 132: MOV c,8(SP)   | Store c              |
|                  | 140: MOV #156,*SP  | Store return address |
|                  | 148: GOTO 500      | Jump to foo          |
| func foo         | 156: MOV 12(SP),R0 | Get return value     |
|                  | 164: SUB #16,SP    | Remove frame         |
| return t1        | 172:               |                      |
|                  |                    |                      |
|                  | 500:               |                      |
|                  | 564: MOV R0,12(SP) | Store return value   |
|                  | 572: GOTO *SP      | Return to caller     |
|                  |                    |                      |

Note: Language and machine dependent Here we assume C-like implementation with SP and no FP